Migrating from the Recognition Component to the OCR Component
In ImageGear .NET v25, we’re introducing a new OCR component and removing the old Recognition component. If you use the Recognition component and are upgrading from v24.x or earlier to v25 or later, you’ll need to migrate your code to the new OCR component.
Mapping Recognition API to OCR API
The tables below list the OCR component equivalent of each Recognition API element. Elements marked "Not implemented" have no equivalent in the OCR component.
Namespaces
ImageGear Recognition | ImageGear OCR
ImageGear.Recognition | ImageGear.OCR
Classes
ImageGear Recognition | ImageGear OCR
ImGearRecognition | ImGearOCR
ImGearRecAsianSettings | Not implemented
ImGearRecCheckWordEventArgs | Not implemented
ImGearRecCodePageCollection | Not implemented
ImGearRecDocument | Not implemented
ImGearRecImage | ImGearOCRImage
ImGearRecLetter | ImGearOCRLetter
ImGearRecModuleCollection | Not implemented
ImGearRecModuleSettings | Not implemented
ImGearRecMORSettings | Not implemented
ImGearRecOutputFormatCollection | Not implemented
ImGearRecOutputManager | Not implemented
ImGearRecPage | ImGearOCRPage
ImGearRecPDFOutputOptions | ImGearOCRPDFOutputOptions
ImGearRecPreprocessingSettings | ImGearOCRPreprocessingSettings
ImGearRecProgressEventArgs | Not implemented
ImGearRecRecognitionLanguageEnabled | ImGearOCRLanguageEnabled
ImGearRecRecognitionSettings | ImGearOCRSettings
ImGearRecSettingsCollectionObjectBase | Not implemented
ImGearRecStatistics | Not implemented
ImGearRecUDItem | Not implemented
ImGearRecUserDictionary | ImGearOCRDictionary
ImGearRecZone | ImGearOCRZone
ImGearRecZoneCollection | ImGearOCRZoneCollection
Enumerations
ImageGear Recognition | ImageGear OCR
ImGearRecCheckWordOpinion | Not implemented
ImGearRecDecompositionMethod | Not implemented
ImGearRecDeskewMode | ImGearOCRDeskewMode
ImGearRecDirectTextFormat | Not implemented
ImGearRecErrorCodes | Not implemented
ImGearRecFillingMethod | Not implemented
ImGearRecFilter | ImGearOCRFilter
ImGearRecFontFlags | Not implemented
ImGearRecInvertMode | ImGearOCRInvertMode
ImGearRecLanguage | ImGearOCRLanguage
ImGearRecLicenseFeature | Not implemented
ImGearRecMakeupInfo | Not implemented
ImGearRecModule | Not implemented
ImGearRecMrcQuality | Not implemented
ImGearRecOrientationMode | ImGearOCROrientationMode
ImGearRecOutputCodePageType | Not implemented
ImGearRecOutputLevel | Not implemented
ImGearRecProcess | Not implemented
ImGearRecRecognitionModule | Not implemented
ImGearRecReductionMode | ImGearOCRReductionMode
ImGearRecResEnhancementMode | Not implemented
ImGearRecSpaceType | Not implemented
ImGearRecTradeoff | Not implemented
ImGearRecZoneCheckingFlags | Not implemented
ImGearRecZoneType | Not implemented
Structures
ImageGear Recognition | ImageGear OCR
ImGearRecCodePage | Not implemented
ImGearRecOutputFormat | Not implemented
Delegates
ImageGear Recognition | ImageGear OCR
ImGearRecCheckWordEventHandler | Not implemented
ImGearRecProgressEventHandler | Not implemented
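For larger code bases, the type renames in the tables above could be applied mechanically as a first pass. The following is a minimal sketch of such a pass, not an ImageGear tool: RecToOcrRenamer, Rename, and the unmapped list are hypothetical names introduced here for illustration. It renames only type and namespace identifiers that have a listed OCR equivalent and collects every other ImGearRec* identifier for manual review. Member-level changes (for example, renamed properties or changed method arguments) and possible differences in enumeration member names are not handled.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper (not part of ImageGear): renames Recognition type names
// in C# source text to the OCR equivalents listed in the mapping tables.
public static class RecToOcrRenamer
{
    // Only types with a listed OCR equivalent appear here.
    private static readonly Dictionary<string, string> TypeMap = new Dictionary<string, string>
    {
        { "ImGearRecognition", "ImGearOCR" },
        { "ImGearRecImage", "ImGearOCRImage" },
        { "ImGearRecLetter", "ImGearOCRLetter" },
        { "ImGearRecPage", "ImGearOCRPage" },
        { "ImGearRecPDFOutputOptions", "ImGearOCRPDFOutputOptions" },
        { "ImGearRecPreprocessingSettings", "ImGearOCRPreprocessingSettings" },
        { "ImGearRecRecognitionLanguageEnabled", "ImGearOCRLanguageEnabled" },
        { "ImGearRecRecognitionSettings", "ImGearOCRSettings" },
        { "ImGearRecUserDictionary", "ImGearOCRDictionary" },
        { "ImGearRecZone", "ImGearOCRZone" },
        { "ImGearRecZoneCollection", "ImGearOCRZoneCollection" },
        { "ImGearRecDeskewMode", "ImGearOCRDeskewMode" },
        { "ImGearRecFilter", "ImGearOCRFilter" },
        { "ImGearRecInvertMode", "ImGearOCRInvertMode" },
        { "ImGearRecLanguage", "ImGearOCRLanguage" },
        { "ImGearRecOrientationMode", "ImGearOCROrientationMode" },
        { "ImGearRecReductionMode", "ImGearOCRReductionMode" },
    };

    // Matches whole identifiers that start with "ImGearRec".
    private static readonly Regex RecIdentifier = new Regex(@"\bImGearRec\w*\b");

    // Returns the rewritten source; identifiers without a listed equivalent
    // are left unchanged and appended to 'unmapped' for manual review.
    public static string Rename(string source, ICollection<string> unmapped)
    {
        source = source.Replace("ImageGear.Recognition", "ImageGear.OCR");
        return RecIdentifier.Replace(source, match =>
        {
            if (TypeMap.TryGetValue(match.Value, out string replacement))
            {
                return replacement;
            }
            unmapped.Add(match.Value);
            return match.Value;
        });
    }
}

public static class Program
{
    public static void Main()
    {
        var unmapped = new List<string>();
        string result = RecToOcrRenamer.Rename(
            "ImGearRecPage page; ImGearRecStatistics stats;", unmapped);
        Console.WriteLine(result);                      // ImGearOCRPage page; ImGearRecStatistics stats;
        Console.WriteLine(string.Join(", ", unmapped)); // ImGearRecStatistics
    }
}
```

Matching whole identifiers rather than substrings keeps ImGearRecZone from being rewritten inside ImGearRecZoneCollection, and leaving unmapped names untouched keeps the output compilable against neither API by accident: the compiler will still flag each one.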
Migrating Your Code to Use the New OCR
This section pairs common Recognition component tasks with their OCR component equivalents to help you migrate code from the old API to the new one.
Create the Recognition Object
Previously using Recognition:
ImGearRecognition recognitionEngine = new ImGearRecognition(resourcePath);
Here, resourcePath is the path to the Recognition binaries.
Currently using OCR:
ImGearOCR recognitionEngine = ImGearOCR.Create(resourcePath);
Here, resourcePath is the path to the OCR binaries.
Import Raster Page to the Recognition Engine
Previously using Recognition:
ImGearRecPage recognitionPage = recognitionEngine.ImportPage((ImGearRasterPage)igPage);
Currently using OCR:
ImGearOCRPage ocrPage = recognitionEngine.ImportPage((ImGearRasterPage)igPage);
Enable Recognition Languages
Previously using Recognition:
foreach (ImGearRecLanguage lang in ImGearRecLanguage_collection)
    recognitionEngine.Recognition.LanguageEnabled[lang] = true;
Currently using OCR:
foreach (ImGearOCRLanguage lang in ImGearOCRLanguage_collection)
    recognitionEngine.Settings.LanguageEnabled[lang] = true;
Specify Zones
Previously using Recognition:
// Define a zone that covers the whole page
ImGearRecZone zone = new ImGearRecZone();
zone.Rect.Left = 0;
zone.Rect.Top = 0;
zone.Rect.Right = rasterPage.DIB.Width - 1;
zone.Rect.Bottom = rasterPage.DIB.Height - 1;
recognitionPage.Zones.Add(zone);
Currently using OCR:
// Define a zone that covers the whole page
ImGearOCRZone zone = new ImGearOCRZone();
zone.Rect.Left = 0;
zone.Rect.Top = 0;
zone.Rect.Right = rasterPage.DIB.Width - 1;
zone.Rect.Bottom = rasterPage.DIB.Height - 1;
ocrPage.Zones.Add(zone);
Explicit Preprocessing
Previously using Recognition:
// Reduce to bitonal
recPage.Image.ReduceToBitonal(ImGearRecReductionMode.AUTO, 50, 0, ImGearRecResEnhancementMode.YES);
// Invert the image
recPage.Image.Invert();
// Despeckle
recPage.Image.Despeckle();
// Detect skew and orientation, then correct both
int slope;
ImGearRecOrientationMode orient;
recPage.Image.DetectSkew(out slope, out orient);
recPage.Image.Orient(orient);
recPage.Image.Deskew(slope);
Currently using OCR:
// Reduce to bitonal (the OCR call takes no resolution enhancement argument)
ocrPage.Image.ReduceToBitonal(ImGearOCRReductionMode.AUTO, 50, 0);
// Invert the image
ocrPage.Image.Invert();
// Despeckle
ocrPage.Image.Despeckle();
// Detect skew and orientation, then correct both
int slope;
ImGearOCROrientationMode orient;
ocrPage.Image.DetectSkew(out slope, out orient);
ocrPage.Image.Orient(orient);
ocrPage.Image.Deskew(slope);
See the ReduceToBitonal method for more information.
Implicit Preprocessing
Previously using Recognition:
rec.Preprocessing.OrientationMode = ImGearRecOrientationMode.AUTO;
rec.Preprocessing.DeskewMode = ImGearRecDeskewMode.AUTO;
rec.Preprocessing.DespeckleMode = true;
rec.Preprocessing.InversionMode = true;
recPage.Image.Preprocess();
Currently using OCR:
ocr.Preprocessing.OrientationMode = ImGearOCROrientationMode.AUTO;
ocr.Preprocessing.DeskewMode = ImGearOCRDeskewMode.AUTO;
ocr.Preprocessing.DespeckleMode = true;
ocr.Preprocessing.InversionMode = true;
ocrPage.Image.Preprocess();
In OCR, calling ocrPage.Image.Preprocess() always reduces the image to bitonal.
Recognize the Page
Previously using Recognition:
recPage.Recognize();
Currently using OCR:
ocrPage.Recognize();
Get Recognition Result as an Array of Letters
Previously using Recognition:
ImGearRecLetter[] letters = recPage.GetLetters();
// ... do something with letters
Currently using OCR:
ImGearOCRLetter[] letters = ocrPage.GetLetters();
// ... do something with letters
Get Direct Text
Direct Text output works much as it did in previous versions of ImageGear. Although ImGearRecOutputManager was removed from the product, all overloads of the WriteDirectText method are now available on the ImGearOCR object:
Currently using OCR:
void WriteDirectText(ImGearOCRPage, string);
void WriteDirectText(ImGearOCRPage[], string);
void WriteDirectText(ImGearOCRPage, Stream);
void WriteDirectText(ImGearOCRPage[], Stream);
The property ImGearRecOutputManager.DirectTextFormat has moved to ImGearOCRSettings. Its type still defines four values: SimpleText, CommaSeparatedText, FormattedText and XmlWithCoordinates. The SimpleText, CommaSeparatedText and FormattedText formats are unchanged. The XmlWithCoordinates format has changed but contains the same information.
Get the Output as PDF Page and Save it to PDF File
Previously using Recognition:
using (ImGearPDFDocument pdfDocument = new ImGearPDFDocument())
{
    recPage.CreatePDFPage(pdfDocument, somePDFOptions);
    pdfDocument.Save(outputPDFPath, ImGearSavingFormats.PDF, 0, 0, (int)ImGearPDFPageRange.ALL_PAGES, ImGearSavingModes.OVERWRITE);
}
Currently using OCR:
using (ImGearPDFDocument pdfDocument = new ImGearPDFDocument())
{
    ocrPage.CreatePDFPage(pdfDocument, somePDFOptions);
    pdfDocument.Save(outputPDFPath, ImGearSavingFormats.PDF, 0, 0, (int)ImGearPDFPageRange.ALL_PAGES, ImGearSavingModes.OVERWRITE);
}
Exception Handling
The new OCR component implements the exception class ImGearOCRException. This class is derived from ImGearException and indicates that a failure occurred during an OCR operation.
See also the information on error codes and error handling.
Currently using OCR:
try
{
    // Some OCR operation
}
catch (ImGearOCRException ex)
{
    // Failure raised by the OCR component
    Console.WriteLine("OCR-specific exception: " + ex.Message);
}
catch (Exception ex)
{
    // Any other failure
    Console.WriteLine("Generic exception: " + ex.Message);
}
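Taken together, the OCR-side steps in this section might be combined roughly as follows. This is a sketch under stated assumptions rather than a reference implementation: it uses only the calls shown in this section, the class and method names OcrMigrationSketch and RecognizeLetters are hypothetical, ImGearRasterPage is assumed to come from the ImageGear.Core namespace, and ImageGear is assumed to be initialized and licensed by the caller. Cleanup of the engine and page objects is omitted because this guide does not describe it.

```csharp
using System;
using System.Collections.Generic;
using ImageGear.Core; // assumption: namespace that provides ImGearRasterPage
using ImageGear.OCR;

// Hypothetical wrapper that chains the OCR steps shown in this section.
public static class OcrMigrationSketch
{
    public static ImGearOCRLetter[] RecognizeLetters(
        string resourcePath,
        ImGearRasterPage rasterPage,
        IEnumerable<ImGearOCRLanguage> languages)
    {
        try
        {
            // Create the engine; resourcePath is the path to the OCR binaries.
            ImGearOCR ocr = ImGearOCR.Create(resourcePath);

            // Import the raster page into the engine.
            ImGearOCRPage ocrPage = ocr.ImportPage(rasterPage);

            // Enable the caller's recognition languages.
            foreach (ImGearOCRLanguage lang in languages)
                ocr.Settings.LanguageEnabled[lang] = true;

            // Implicit preprocessing; Preprocess() also reduces to bitonal.
            ocr.Preprocessing.OrientationMode = ImGearOCROrientationMode.AUTO;
            ocr.Preprocessing.DeskewMode = ImGearOCRDeskewMode.AUTO;
            ocrPage.Image.Preprocess();

            // Recognize and return the result as letters.
            ocrPage.Recognize();
            return ocrPage.GetLetters();
        }
        catch (ImGearOCRException ex)
        {
            // Log OCR-specific failures, then let the caller decide how to recover.
            Console.WriteLine("OCR-specific exception: " + ex.Message);
            throw;
        }
    }
}
```

Passing the languages in as a parameter avoids hard-coding ImGearOCRLanguage member names, which this guide does not list.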